Europäisches Patentamt
European Patent Office
Office européen des brevets




(11) EP 0 777 357 A2



(43) Date of publication: 

04.06.1997 Bulletin 1997/23 



EUROPEAN PATENT APPLICATION 

(51) Int. Cl.6: H04L 12/24



(21) Application number: 96308419.9 

(22) Date of filing: 21.11.1996 



(84) Designated Contracting States:
DE FR GB

(30) Priority: 28.11.1995 US 565180

(71) Applicant: NCR International, Inc.
Dayton, Ohio 45479 (US)

(72) Inventor: Bondi, Andre B.
Red Bank, NJ 07001 (US)

(74) Representative: Robinson, Robert George
International Intellectual Property Department,
NCR Limited,
206 Marylebone Road
London NW1 6LY (GB)



(54) Network management system with improved node discovery and monitoring 



(57) A method and system for monitoring nodes in
a network having at least one network management
station and a plurality of nodes. A queue (10) stores polling
messages for transmission to the nodes where each
node is indexed by its network address. The network
management station sends polling messages to the
nodes in sequence at a predetermined rate controlled
by a rate control mechanism (12). Polling messages are
sent up to four times to a particular node. The transmission
of these messages is recorded in a table which is
indexed by the network address of each node, and by
the time of the next scheduled timeout (the time period
between successive polling messages) associated with
each node. The network management station determines
if another polling message should be sent to each
of the nodes. If the fourth polling message has been sent
to a particular node, it has been unacknowledged by that
node and the timeout has expired, then the node is
determined to have failed.






Description 

The present invention relates to a network management
station and more particularly to a network management
station which reduces the elapsed time in
which a network's topology is discovered and updated.

Large communication infrastructures, known as internets,
are composed of wide and local area networks
and consist of end-systems, intermediate systems and
media devices. Communication between nodes on the
networks is governed by communication protocols, such
as the TCP/IP protocol. The end-systems include mainframes,
workstations, printers and terminal servers.
Intermediate systems typically include routers used to
connect the networks together. The media devices,
such as bridges, hubs and multiplexors, provide
communication links between different end-systems in the
network. In each network of an internet, the various end
systems, intermediate systems and media devices are
typically manufactured by many different vendors, and
managing these multi-vendor networks requires
standardised network management protocols.

Generally, to support the communication network, 
network management personnel want to know what 
nodes are connected to the network, what each node is 
(e.g. a computer, router or printer), the status of each
node, potential problems with the network, and if possible
any corrective measures that can be taken when
abnormal status, malfunction or other notifiable events are
detected.

To assist network management personnel in maintaining
the operation of the internet, a network management
framework was developed to define rules describing
management information, a set of managed objects
and a management protocol. One such protocol is the
simple network management protocol (SNMP).

Network management systems need to interact with 
existing hardware while minimising the host processor 
time needed to perform network management tasks. In 
network management, the host processor or network 
management station is known as the network manager. 
A network manager is typically an end-system, such as 
a mainframe or workstation, assigned to perform the 
network managing tasks. More than one end-system 
may be used as a network manager. The network man- 
ager is responsible for monitoring the operation of a 
number of end-systems, intermediate systems and me- 
dia devices, which are known as managed nodes. The 
network manager, the corresponding managed nodes 
and the data links there between are known as a subnet. 
Many different tasks are performed by the network man- 
ager. One such task is to initially discover the different 
nodes (e.g. end-systems, routers and media devices) 
connected to the network. After discovery, the network 
manager continuously determines how the network or- 
ganisation has changed. For example, the network 
manager determines what new nodes are connected to 
the network. Another task, performed after discovery, is
to determine which nodes on the network are operational.
In other words, the network manager determines
which nodes have failed.

Once the nodes on the network are discovered and
their status ascertained, the information is stored in a
database and network topology maps of the networks
and/or subnets can be generated and displayed along
with the status of the different nodes along the network
to the network management personnel. Topology maps
assist the personnel in the troubleshooting of network
problems and with the routing of communications along
the networks, especially if nodes have failed.

Through the discovery process, the network manager
ascertains its internet protocol (IP) address, the
range of IP addresses for the subnet components (i.e.
the subnet mask), a routing table for a default router and
address resolution protocol (ARP) cache tables from
known and previously unknown nodes with SNMP
agents. To ascertain the existence of network nodes, the
discovery process performs configuration polls of
known nodes and retrieves the ARP cache tables from
the known nodes, and the routing tables. The network
manager then verifies the existence of those nodes listed
in these tables that it has not previously recorded in
its database.

Examples of network manager systems are the
Onevision™ network management station produced by
AT&T and the Openview™ network manager produced
by Hewlett Packard. Currently these systems discover
nodes and verify the existence and status of nodes by
sending to each node an internet control message protocol
(ICMP) poll and waiting for a response. The ICMP
poll is also known as a ping. If no response is received
after a specific period of time, the node is determined to
be non-operational or to have failed. The change in status
of the node is then reflected by the network management
station, for example, by updating the topology map.
Instances may occur when the ping is not received by
the node, or the node is busy performing another task
when the ping is sent. Thus, to verify that a node has
actually failed, the network manager sends out a sequence
of M pings, where M is an arbitrary but preferably fixed
number, such as four. Each successive ping is transmitted
if a corresponding acknowledgement is not received
during an associated scheduled timeout interval.
Preferably, the timeout interval is increased for each
successive ping. The sequence of pings terminates either if one
of the pings is acknowledged, or if no acknowledgement
has been received after the timeout interval associated
with the Mth ping has expired. If no response is received
to the Mth ping, the node is declared to be
non-operational ("down").

To illustrate, the network management station in the
OpenView system sends an ICMP poll (ping) to a node
and waits for a response. If no response is received from
the first ping within ten seconds a second ping is sent
out. If no response is received from the second ping
within twenty seconds a third ping is sent out. If no
response is received from the third ping within forty
seconds a fourth ping is sent out. If no response is received
from the fourth ping within eighty seconds the node is
declared down. The total time from when the first ping
is sent to the determination that the node is down can
take about 2.5 minutes.
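
The 2.5 minute figure follows from adding the four timeout intervals. The short sketch below checks that arithmetic; the function name and the doubling rule are illustrative assumptions (the text only lists the four intervals).

```python
def timeout_schedule(first_timeout_s=10, max_pings=4):
    """Return the timeout (in seconds) that follows each ping.

    Assumption: each timeout is double the previous one, which
    reproduces the 10/20/40/80 second example in the text.
    """
    return [first_timeout_s * 2 ** i for i in range(max_pings)]


schedule = timeout_schedule()
total_s = sum(schedule)
print(schedule)       # [10, 20, 40, 80]
print(total_s / 60)   # 2.5
```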

To prevent an overflow of pings from occurring during,
for example, initial discovery, these current systems
limit the number of nodes with unacknowledged ICMP polls
to three or fewer. To limit the number of unacknowledged
polls, the ICMP polls for each managed node are stored
in memory (a pending polling queue) of the network
management station and subsequently transferred to an
active polling queue capable of queuing only three
nodes. Thus, in the example of Fig. 1, node A is in
queue 1, node B is in queue 2, and node C is in
queue 3. The three
nodes in the active polling queue are then each polled
with an ICMP poll. As a poll is acknowledged or in the
event a node is declared down, the queue is cleared and
the next in line node is placed in the active polling queue.
A ping is then sent to the next in line node.

Using the above queueing configuration, if for example
three failed nodes are polled in rapid succession,
the status of other nodes cannot be ascertained for at
least the next 2.5 minutes, since no more than three
nodes may have unacknowledged polls concurrently.
Similarly, it may take 5 minutes to diagnose the failure
of six nodes in succession. It may take 7.5 minutes to
diagnose the failure of nine nodes. As a result, the
discovery and/or status polling process performed by the
network management station could be substantially
delayed, thus increasing the elapsed time used by the
network management station to perform network management
tasks. Further, the topology map may be delayed
in being updated, thus increasing the time to diagnose
the problem with the network.
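
The delays quoted above can be reproduced with a small model of the three-slot active polling queue. This is only a sketch: the function name is hypothetical, every node is assumed to be failed, and each node is assumed to occupy its slot for the full 150 seconds.

```python
import heapq

PER_NODE_DOWN_S = 10 + 20 + 40 + 80  # 150 s to declare one node down


def time_to_diagnose(failed_nodes, active_slots=3, per_node_s=PER_NODE_DOWN_S):
    """Seconds until the last of `failed_nodes` consecutive failed nodes
    is declared down when only `active_slots` nodes may be polled at once."""
    slots = [0] * active_slots      # time at which each active slot becomes free
    heapq.heapify(slots)
    finish = 0
    for _ in range(failed_nodes):
        start = heapq.heappop(slots)    # earliest slot to become free
        finish = start + per_node_s     # slot is held until the node is declared down
        heapq.heappush(slots, finish)
    return finish


for n in (3, 6, 9):
    print(n, time_to_diagnose(n) / 60)   # 2.5, 5.0 and 7.5 minutes
```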

With the increase in size and use of internets, the 
management of such networks has become increasing- 
ly difficult. The resulting increase in the number of nodes 
increases the possibility of polling several failed nodes 
in sequence. Currently, a failure of multiple nodes would
cause the discovery procedure to be effectively frozen 
as described above. 

It is an object of the present invention to provide an
alternative technique for verifying the operational status
of network nodes in order to reduce the elapsed time of
network discovery and the elapsed time of status polling
and to rapidly provide network configuration updates
which may be displayed on the topology map and assist
network management personnel in troubleshooting
failures more rapidly.

According to one aspect of the present invention
there is provided a method for monitoring nodes in a
network having at least one network management station
and a plurality of nodes, characterised by the steps of:
providing a queue of polling messages for transmission
to the nodes, each polling message being indexed by
the network address of one of the nodes; sending said
queued polling messages from the network management
station to the plurality of nodes at a predetermined
rate; recording transmission of the polling messages in
a table having a first portion indexed by the network
address of each node and a second portion indexed by a
timeout associated with a polling message count for
each node having an outstanding status poll; and
determining for each node if that node has failed after a
predetermined number of polling messages have been sent
to that node.

According to another aspect of the present invention
there is provided a system for managing a network
comprising at least one network management station,
and a plurality of nodes connected to the network
management station for data communications there
between, characterized in that each said network management
station includes a queue of polling messages for
transmission to the nodes, and a poll table having a first
portion indexed by a network address of each node and
a second portion indexed by a timeout associated with
a polling message count; and wherein said network
management station determines for each node if that
node has failed after a predetermined number of polling
messages have been sent to that node within an
elapsed timeout period.

One embodiment of the present invention will now
be described by way of example with reference to the
accompanying drawings in which:-

Fig. 1 is a block diagram of a known polling queue
for determining the status of nodes;
Fig. 2 is a block diagram of an exemplary network
topology;
Fig. 3 is a block diagram of a status poll transmission
mechanism according to the present invention;
Fig. 4 is a block diagram of a status poll transmission
queue and an unacknowledged poll table
according to the present invention;
Fig. 5 is a block diagram of an unacknowledged poll
table according to the present invention;
Fig. 6 is a block diagram of the exemplary network
topology of Fig. 2, illustrating a failed managed
node and other nodes and links affected by the
failed node; and
Fig. 7 is a flow diagram for the operation of the
network management station during discovery and
status verification.

The present invention provides a network management
method and system which improve the discovery
process and the status monitoring process of current
network management systems. It should be noted that
the following description is based on a communication
network using the TCP/IP communication protocol and
a network managing framework using the SNMP protocol.
However, the invention is applicable to network
management environments based on other network
configurations using other types of communication and
management protocols as well.

As noted above, during node discovery or during 
node status verification, the network manager sends ICMP
polls (pings) to each node identified in, for example,
the ARP cache and any known router tables. The node
manager then waits for a response from the target node.
The response may include information, such as the 
node IP address, status information regarding the node, 
the type of node (e.g. computer, router or hub), and the 
number of interfaces in the node. When a response is 
received, the node information is stored in an IP topol- 
ogy database. Typically, the decision to manage a node 
is made by network management personnel or the net- 
work management station. 

Preferably, the IP topology database consists of a 
series of tables, each containing specific information re- 
garding the network. For example, a network table con- 
tains information associated with each network in the 
topology. This information may include the type of network,
IP address, subnet mask, and the times associated
with the creation and modification of the network entry
in the table. A segment table contains information
associated with each segment (or subnet) in the network 
topology. This information may include the name of the 
subnet, number of interfaces connected to the subnet, 
and the times associated with the creation and modifi- 
cation of the subnet entry in the table. A node table con- 
tains information associated with each node in the net- 
work topology. The node information may include, for 
example, the IP network manager, a SNMP system de- 
scription, the number of interfaces in the node, and 
times associated with the creation and modification of
the node entry in the table. The information stored in the 
IP topology database is primarily obtained from the dis- 
covery process, but may also be entered from network 
management personnel. 
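
One way to picture the three tables is as simple record types. The sketch below is illustrative only: the class and field names are hypothetical, the fields are limited to those the description lists as examples, and timestamps are plain floats.

```python
from dataclasses import dataclass, field
import time


@dataclass
class NetworkEntry:
    network_type: str
    ip_address: str
    subnet_mask: str
    created: float = field(default_factory=time.time)
    modified: float = field(default_factory=time.time)


@dataclass
class SegmentEntry:
    name: str
    interface_count: int
    created: float = field(default_factory=time.time)
    modified: float = field(default_factory=time.time)


@dataclass
class NodeEntry:
    snmp_system_description: str
    interface_count: int
    created: float = field(default_factory=time.time)
    modified: float = field(default_factory=time.time)


@dataclass
class IPTopologyDatabase:
    """Hypothetical container: one dictionary per table, keyed by name or address."""
    networks: dict = field(default_factory=dict)
    segments: dict = field(default_factory=dict)
    nodes: dict = field(default_factory=dict)


db = IPTopologyDatabase()
db.networks["192.0.2.0"] = NetworkEntry("ethernet", "192.0.2.0", "255.255.255.0")
db.nodes["192.0.2.7"] = NodeEntry("example router", interface_count=4)
```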

From the IP topology database, an IP topology map 
can be created. The IP map is a map of the network 
topology which places the discovered nodes in appro- 
priate subnets, networks, and/or internets depending 
upon the level of the topology being mapped. In the system
of the present invention, the IP map is preferably
updated as the status of a node changes. The IP map
displays the different nodes using icons or symbols that
represent the node from, for example, an SNMP MIB
file.

As discussed above, some current network man- 
agement systems limit the number of unacknowledged 
pings to three nodes so as to prevent flooding the network
with pings. 

Referring now to Fig. 1, a block diagram of the queuing
sequence for sending pings to different nodes is
shown. Queues 1, 2 and 3 store the ping count for nodes
A, B and C respectively. The queues are not cleared until
the ping is acknowledged or when the time for each ping
expires, i.e. a timeout occurs for the Mth ping, and the
node is declared to have failed. Thus, a ping cannot be
sent to node D until one of the queues is cleared.

Referring to Fig. 2, a block diagram of an exemplary
network topology is shown to which the present invention
may apply.

The network manager according to the present invention
provides a status poll transmission queue which
speeds the processing of acknowledgements by storing
unacknowledged pings in an ordered data table of arbitrary
size indexed by the IP address of each target node.
The size of the data table may be fixed or it may vary.
To speed the management of timeouts, unacknowledged
pings are also stored in an ordered data table
indexed by the time by which a timeout is scheduled to
occur for a particular ping. Each record in each data table
contains a pointer to a corresponding record in the
other table to facilitate rapid removal of the managed
node from the queue in the event a timeout occurs for
the Mth ping, or upon receipt of an acknowledgement of
the ping, whichever occurs first.

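
The doubly indexed table described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class and method names are hypothetical, a dictionary stands in for the IP-indexed table, and a binary heap stands in for the timeout-ordered table. Instead of following a cross-pointer to delete the heap entry directly, the sketch shares one record object between both indexes and marks it dead (lazy deletion), which is a substituted technique.

```python
import heapq
import itertools


class UnacknowledgedPollTable:
    def __init__(self):
        self._by_ip = {}         # ip -> record shared with the heap
        self._by_timeout = []    # heap of records ordered by deadline
        self._seq = itertools.count()  # tie-breaker so records never compare by ip

    def record_ping(self, ip, deadline, ping_count):
        """Store (or replace) the outstanding ping for `ip`."""
        self.remove(ip)
        record = [deadline, next(self._seq), ip, ping_count, True]
        self._by_ip[ip] = record
        heapq.heappush(self._by_timeout, record)

    def remove(self, ip):
        """Drop the record for `ip`, e.g. when an acknowledgement arrives."""
        record = self._by_ip.pop(ip, None)
        if record is not None:
            record[4] = False    # heap entry is skipped later
        return record

    def pop_expired(self, now):
        """Return [(ip, ping_count), ...] for every live record whose deadline has passed."""
        expired = []
        while self._by_timeout and self._by_timeout[0][0] <= now:
            record = heapq.heappop(self._by_timeout)
            if record[4]:
                del self._by_ip[record[2]]
                expired.append((record[2], record[3]))
        return expired


table = UnacknowledgedPollTable()
table.record_ping("10.0.0.1", deadline=10.0, ping_count=1)
table.record_ping("10.0.0.2", deadline=10.5, ping_count=1)
table.remove("10.0.0.1")                # acknowledgement received
expired = table.pop_expired(now=11.0)
print(expired)                          # [('10.0.0.2', 1)]
```

Lookup by address and retrieval of the earliest timeout are both cheap here, which is the property the two indexes are meant to provide.
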
Referring to Figs. 3-5, a status poll transmission
mechanism and queue for nodes A-Z are illustrated,
where A-Z represent the identity of each node assigned
to the node manager. The status poll transmission
queue 10 identifies the nodes which are scheduled to
be polled. The status poll transmission queue 10 stores
the node identity of the nodes which are awaiting
transmission of a poll, and is preferably a FIFO (first in first
out) queue or a FCFS (first come first serve) queue.
However, other types of queues may be utilised, e.g., a
LCFS (last come first serve) queue. A queue might also
be ordered by some attribute of the objects waiting in it,
such as priority class or node type. A rate control
mechanism 12 controls the rate at which the pings are sent
on the network to the nodes. As the pings are sent,
records of the transmission of the pings are stored in an
unacknowledged poll table (see Figs. 4 and 5). As noted,
the unacknowledged poll table consists of two data
records (an IP record and a timeout record) that are
configured to allow an arbitrary number of nodes to be
polled concurrently without receiving an acknowledgement.
This configuration allows many status polls to be
outstanding (unacknowledged) at one time. The rate
control mechanism 12 (see Fig. 3) prevents the network
from being flooded with pings. Combining the utilisation
of the unacknowledged poll table configuration with the
rate control mechanism 12 allows the network to be
discovered rapidly even when status polls are unacknowledged
for long periods of time. As seen in Fig. 4, the IP
record is indexed by the IP address of the target nodes,
and the timeout record is indexed by the scheduled timeout
for the particular ping being transmitted. The timeout
record also includes a poll count record. The scheduled
timeout is the time period between successive pings
targeted at a particular node. The poll count record
represents an arbitrary number of pings that have been sent
to the target node before the node is determined to have
failed. The maximum ping count may be set by network
management personnel or, more usually, by the designer
of a network management system. Various factors,
such as the acknowledgement return time and the
probability of packet loss, are considered when determining
the ping count. The acknowledgement return time is the
time it takes for the acknowledgement to be received by
the network management station.
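
The queue disciplines mentioned for the status poll transmission queue differ only in which waiting node is served next. The sketch below contrasts them using Python's standard containers; the node names and priority values are made up for illustration.

```python
from collections import deque
import heapq

# FIFO / FCFS: the node that has waited longest is polled first.
fifo = deque(["A", "B", "C"])
first_fifo = fifo.popleft()
print(first_fifo)    # A

# LCFS: the most recently queued node is polled first.
lcfs = ["A", "B", "C"]
first_lcfs = lcfs.pop()
print(first_lcfs)    # C

# Ordered by an attribute: lower priority number is polled first
# (hypothetical priority classes; arrival order breaks ties).
by_priority = []
for arrival, (priority, node) in enumerate([(2, "A"), (0, "B"), (1, "C")]):
    heapq.heappush(by_priority, (priority, arrival, node))
first_priority = heapq.heappop(by_priority)[2]
print(first_priority)    # B
```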

The scheduled timeout may be set to a fixed, pre- 
determined period of time between each ping. Prefera- 
bly, the scheduled timeout between pings varies de- 
pending upon the ping count. For example, in a config- 
uration where the ping count is four, the scheduled time- 
out between a first ping and a second ping may be set 
to about ten seconds, the timeout between the second
ping and a third ping may be set to about twenty
seconds, the timeout between the third ping and the fourth
ping may be set to about forty seconds, and the time
between the fourth ping and the declaration of a failed 
node may be set to about eighty seconds. 

Once a prescribed sequence of timeouts has been 
recorded by the network management station, the node 
is declared to have failed and the change in status of 
the network is stored in the IP topology database and 
reflected in the IP map. 

Referring to Fig. 6, an exemplary network topology 
map is illustrated wherein the hub and its associated 
managed nodes were determined to have failed to
acknowledge the pings.

During the discovery process the IP addresses of
new nodes arrive in bulk on retrieved lists (ARP caches),
causing status polling requests (pings) of previously
unknown nodes to be generated in bursts. To prevent the
consequent ping messages from flooding the network,
the system of the present invention regulates the
transmission of the pings. That is, the system of the present
invention schedules the pings for transmission in rapid
succession at a controlled rate which may be user
specified. The controlled rate of ping transmission may be
dependent upon various factors including, for example,
the current payload on the network, the current spare
capacity on the network, and the buffer size in the portion
of the kernel of the network management station's
operating system that supports network activity.
Preferably, the rate is no faster than that at which the kernel
(i.e. the portion of the operating system of the network
management station that supports process management
and some other system functions) can handle
acknowledgements. Alternatively, the rate may be
automatically adjusted as the ability of the kernel to handle
acknowledgements changes. For example, if the spare
capacity of the network increases, or if the payload on
the network decreases, the rate at which pings may be
sent also may be increased. Conversely, if the spare
capacity of the network decreases, or if the payload on
the network increases, the rate at which pings may be
sent may also be decreased.

As noted, to prevent a flood of pings on the network
the pings are scheduled for transmission in rapid
succession at the controlled rate using, for example, the
rate control mechanism. One method for monitoring the
throughput of pings is similar to the "leaky bucket"
monitoring algorithm used to provide a sustained throughput
for the transmission of asynchronous transfer mode
(ATM) cells in an ATM network. A description of the leaky
bucket algorithm can be found in "Bandwidth Management: A
Congestion Control Strategy for Broadband Packet
Networks - Characterizing the Throughput-burstiness
Filter", by A.E. Eckberg, D.J. Luan and D.M. Lucantoni,
Computer Networks and ISDN Systems 20 (1990) pp.
415-423, which is incorporated herein by reference.
Generally, in the "leaky bucket" algorithm, a set
number of pings are transmitted within a specified time
frame, and pings in excess of this number can be
queued. As noted, the controlled rate can be set by
network management personnel or can be automatically
adjusted by the network management station.
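
A rate control mechanism of the kind just described (a set number of pings per time frame, with the excess queued) might be sketched as below. This is a simplified fixed-window pacer written for illustration; it is not the algorithm from the cited paper, and the class name, parameters and caller-supplied clock are assumptions.

```python
from collections import deque


class WindowedPingPacer:
    """Release at most `burst` pings per `interval_s`; queue the excess."""

    def __init__(self, burst, interval_s):
        self.burst = burst
        self.interval_s = interval_s
        self.pending = deque()      # pings waiting for transmission
        self.window_start = 0.0
        self.sent_in_window = 0

    def submit(self, ip):
        self.pending.append(ip)

    def release(self, now):
        """Return the addresses that may be pinged at time `now` (seconds)."""
        if now - self.window_start >= self.interval_s:
            self.window_start = now     # start a new time frame
            self.sent_in_window = 0
        released = []
        while self.pending and self.sent_in_window < self.burst:
            released.append(self.pending.popleft())
            self.sent_in_window += 1
        return released


pacer = WindowedPingPacer(burst=2, interval_s=1.0)
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    pacer.submit(ip)
print(pacer.release(now=0.0))   # ['10.0.0.1', '10.0.0.2']
print(pacer.release(now=0.5))   # []
print(pacer.release(now=1.0))   # ['10.0.0.3']
```

An automatically adjusted rate could be modelled by changing `burst` or `interval_s` between calls.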

Referring to Fig. 7, a flow diagram of the operation
of the network management station during discovery
and status verification is shown. Initially, in discovery the
network management station receives ARP caches and
router tables from various nodes on the network via a
configuration poll. The ARP caches and routing tables
provide the network management station with, for
example, the IP address of nodes along the network. The
information obtained from the ARP cache and the routing
tables is then stored in an IP topology database. As
noted, the determination to manage the node is made
by the network management station or network
management personnel.

To verify the status of nodes, the IP addresses of
the known nodes are stored in, for example, a status poll
transmission queue (seen in Fig. 3) which identifies the
nodes that are to be polled (step 514). When the network
management station is performing status verification
tasks, pings are sent to the newly discovered nodes and
nodes identified in the status poll transmission queue at
the designated IP addresses (step 516). As discussed
above, the pings are sent in a controlled sequence at a
predetermined rate.

As the pings are sent, the IP address associated
with each polled node is stored in the IP record of an
unacknowledged poll table. Simultaneously, a poll count
record in a timeout record of the unacknowledged poll
table is incremented by one and the timeout becomes
the timeout associated with the new poll count (step
518). Thereafter, the IP address for the node is deleted
from the status poll transmission queue (step 520).
Once the ping is sent and the IP address for the node
is deleted from the queue, the system goes into a sleep
mode with respect to the particular node until the ping
is acknowledged or a corresponding timeout occurs,
whichever occurs first (step 522). For each node in the
newly retrieved ARP cache that is not known to the
network management database, a status poll (ping) is sent
in accordance with step 514 above. If the ping has been
acknowledged, the network management station
preferably deletes the IP record and timeout records in the
unacknowledged poll table (step 524).

If the scheduled timeout for a ping occurs first, the
network management station retrieves the ping count
from the ping count record (step 526) and determines if
the ping count matches the predetermined number of
counts, i.e. the station determines if the ping count is at
the maximum number (step 528). If the ping count does
not match the predetermined count number, the IP
address for the node is stored in the status poll transmission
queue (step 514) and a new ping is sent to the same
target node and the network management station
repeats the steps, as shown in Fig. 7.

If at step 528 the ping count does match the
predetermined count number, then the node is determined to
have failed (step 530). Thereafter, the IP topology
database is updated with the change in status of the node.
The record for that node is then removed from the status
poll transmission queue and unacknowledged poll table
(step 532).

This process can be performed concurrently for 
many nodes thus reducing the delay until each man- 
aged node is polled and increasing the currency of the 
IP topology map. 
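
The flow of Fig. 7 can be illustrated with a small simulation on a virtual clock. This is a sketch under stated assumptions, not the patented implementation: the function and variable names are hypothetical, acknowledgements are treated as arriving instantly, pings are paced at a fixed interval, and the 10/20/40/80 second timeouts are taken from the example given earlier.

```python
import heapq
from collections import deque

TIMEOUTS_S = [10, 20, 40, 80]   # timeout following ping 1..4
MAX_PINGS = len(TIMEOUTS_S)


def poll_nodes(nodes, responds, send_interval_s=0.1):
    """Simulate status polling.

    `responds(ip, ping_count)` returns True if that ping is acknowledged.
    Returns {ip: (status, time_decided)} with status "up" or "down".
    """
    queue = deque(nodes)    # status poll transmission queue (step 514)
    ping_count = {}         # IP-indexed record: pings sent so far
    timeouts = []           # timeout-indexed record: heap of (deadline, ip)
    status = {}
    now = 0.0
    next_send = 0.0
    while queue or timeouts:
        if queue and (not timeouts or next_send <= timeouts[0][0]):
            now = max(now, next_send)
            ip = queue.popleft()                    # send ping, dequeue (516, 520)
            count = ping_count.get(ip, 0) + 1       # record and count it (518)
            ping_count[ip] = count
            next_send = now + send_interval_s       # rate control
            if responds(ip, count):                 # acknowledged (524)
                del ping_count[ip]
                status[ip] = ("up", now)
            else:
                heapq.heappush(timeouts, (now + TIMEOUTS_S[count - 1], ip))
        else:
            deadline, ip = heapq.heappop(timeouts)  # timeout fired (526)
            now = max(now, deadline)
            if ping_count[ip] < MAX_PINGS:          # below the maximum (528)
                queue.append(ip)                    # re-queue for another ping
            else:
                del ping_count[ip]                  # node failed (530, 532)
                status[ip] = ("down", now)
    return status


result = poll_nodes(["A", "B", "C"], responds=lambda ip, n: ip != "B")
print(sorted((ip, s) for ip, (s, _) in result.items()))
# [('A', 'up'), ('B', 'down'), ('C', 'up')]
```

In this run node C is resolved a fraction of a second after the start even though node B stays unacknowledged for about 150 seconds, which is the concurrency benefit the description claims.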



Claims 

1. A method for monitoring nodes in a network having
at least one network management station and a
plurality of nodes, characterised by the steps of:

providing a queue (10) of polling messages for 
transmission to the nodes, each polling mes- 
sage being indexed by the network address of 
one of the nodes; 

sending said queued polling messages from 
the network management station to the plurality 
of nodes at a predetermined rate;
recording transmission of the polling messages 
in a table having a first portion indexed by the 
network address of each node and a second 
portion indexed by a timeout associated with a 
polling message count for each node having an 
outstanding status poll; and 
determining for each node if that node has 
failed after a predetermined number of polling 
messages have been sent to that node. 



2. A method according to claim 1, wherein said step
of determining if a node has failed is characterized
by the steps of:-

determining if the count of polling messages
sent to that node has reached the predetermined
number; and
determining if an elapsed timeout period for that
particular polling message count has expired,
such that when the polling messages sent to a
node are unacknowledged and the polling message
count reaches the predetermined number
and the timeout has expired the node is determined
to have failed.

3. A method according to claim 2, characterized in that
the elapsed timeout period is about 2.5 minutes.

4. A method according to any preceding claim,
characterized by the steps of:-

deleting the network address of a node from
said queue (10) after a polling message is
transmitted to that node; and
if that polling message is unacknowledged and
the node was determined not to have failed
then adding the network address of that node
to said queue so that another polling message
is sent to that node.

5. A method according to any preceding claim,
characterized in that the polling message is an
internet control message protocol polling message;
the network address is an internet protocol address;
and said predetermined number of polling messages
is four.

6. A system for managing a network comprising at
least one network management station, and a
plurality of nodes connected to the network management
station for data communications there between,
characterized in that each said network
management station includes a queue (10) of polling
messages for transmission to the nodes, and a
poll table having a first portion indexed by a network
address of each node and a second portion indexed
by a timeout associated with a polling message
count; and wherein said network management station
determines for each node if that node has failed
after a predetermined number of polling messages
have been sent to that node within an elapsed timeout
period.

7. A system according to claim 6, characterized in that
said network management station has a rate control
mechanism (12) for controlling the rate at which
polling messages are transmitted.
FIG. 3

FIG. 7 (flow diagram, partially recovered): record ping in IP and
timeout records, increment ping count, schedule next timeout (518);
delete node from status poll transmission queue; sleep until ping
acknowledgement or timeout, whichever occurs first (522); on
acknowledgement, delete the IP and timeout records from the
unacknowledged poll table (524); on timeout, retrieve ping count;
delete the IP and timeout records from the unacknowledged poll
table (532); end for this node.